DaC-Join: Dividing the Problem of Joining Tables for Conquering an Efficient and SSD-aware Join Operator

Authors

  • Namom Alencar
  • José Maria Monteiro
  • Ângelo Brayner
Abstract

Solid state drives (SSDs) have emerged as an attractive alternative for storing large databases. On SSDs, a read operation is faster than a write operation. However, database management systems (DBMSs) have been designed on the assumption that read and write operations take the same amount of time, which is characteristic of hard disk drives (HDDs). Thus, to fully exploit the benefits provided by SSDs, DBMS components should be aware of this read/write asymmetry. In this paper, we present a new join algorithm, denoted DaC-Join, whose key goal is to reduce the number of write operations. DaC-Join can reduce the number of write operations by up to 97% and can be up to 81% faster than FlashJoin, a well-known join operator proposed for deployment on SSDs.
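The read/write asymmetry argument can be illustrated with a toy cost model. The sketch below is not the DaC-Join algorithm, which the abstract does not describe; it only shows why a plan that trades extra reads for fewer writes can lose under a symmetric cost assumption and win under an asymmetric one. The function `io_cost`, the two plans, their page counts, and the 10x write penalty are all hypothetical values chosen for illustration.

```python
def io_cost(reads: int, writes: int,
            read_cost: float = 1.0, write_cost: float = 1.0) -> float:
    """Total I/O cost as a weighted sum of page reads and page writes."""
    return reads * read_cost + writes * write_cost


# Two hypothetical join plans (page counts are made up for illustration):
# plan_a materializes intermediate results, plan_b re-reads input instead.
plan_a = {"reads": 1000, "writes": 1000}
plan_b = {"reads": 2500, "writes": 100}

# Symmetric costs (the HDD-style assumption): reads and writes cost the same.
symmetric_a = io_cost(**plan_a)
symmetric_b = io_cost(**plan_b)

# Asymmetric costs: a write is assumed to be 10x as expensive as a read.
asymmetric_a = io_cost(**plan_a, write_cost=10.0)
asymmetric_b = io_cost(**plan_b, write_cost=10.0)

print(f"symmetric:  plan_a={symmetric_a}, plan_b={symmetric_b}")
print(f"asymmetric: plan_a={asymmetric_a}, plan_b={asymmetric_b}")
```

Under the symmetric model the write-heavy plan looks cheaper, while under the assumed asymmetric model the write-avoiding plan does. A cost model that ignores the asymmetry would therefore pick the wrong plan on such a device.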


Similar resources

SEMA-JOIN: Joining Semantically-Related Tables Using Big Table Corpora

Join is a powerful operator that combines records from two or more tables and is of fundamental importance in relational databases. However, traditional join processing mostly relies on string equality comparisons. Given the growing demand for ad hoc data analysis, we have seen an increasing number of scenarios where the desired join relationship is not an equi-join. For example, in ...


TAC: A Topology-Aware Chord-based Peer-to-Peer Network

Among structured Peer-to-Peer systems, Chord has a general popularity due to its salient features like simplicity, high scalability, small path length with respect to network size, and flexibility on node join and departure. However, Chord doesn’t take into account the topology of underlying physical network when a new node is being added to the system, thus resulting in high routing late...


Mining Association Rules from Stars

Association rule mining is an important data mining problem. It is found to be useful for conventional relational data. However, previous work has mostly focused on mining a single table. In real life, a database is typically made up of multiple tables, and one important case is where some of the tables form a star schema. The tables typically correspond to entity sets and joining the tables in...


Efficient Index-based Processing of Join Queries in DHTs

Massively distributed applications require the integration of heterogeneous data from multiple sources. Peer-to-peer (P2P) is one possible network model for these distributed applications and among P2P architectures, distributed hash table (DHT) is well known for its routing performance guarantees. Under a general distributed relational data model, join query operator, an essential component to...


On Joining Graphs

In the graph database literature the term “join” does not refer to an operator used to merge two graphs. In particular, a counterpart of the relational join is not present in existing graph query languages, and consequently no efficient algorithms have been developed for this operator. This paper provides two main contributions. First, we define a binary graph join operator that acts on the ver...



Publication date: 2016